23 research outputs found

    Multi-view Metric Learning in Vector-valued Kernel Spaces

    We consider the problem of metric learning for multi-view data and present a novel method for learning within-view as well as between-view metrics in vector-valued kernel spaces, as a way to capture the multi-modal structure of the data. We formulate two convex optimization problems to jointly learn the metric and the classifier or regressor in kernel feature spaces. An iterative three-step multi-view metric learning algorithm is derived from the optimization problems. In order to scale the computation to large training sets, a block-wise Nyström approximation of the multi-view kernel matrix is introduced. We justify our approach theoretically and experimentally, and show its performance on real-world datasets against relevant state-of-the-art methods.
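
The Nyström approximation mentioned in the abstract above can be illustrated with a minimal sketch. This is the standard single-kernel Nyström method, not the block-wise multi-view variant of the paper; the RBF kernel, the data, the landmark count, and the names `rbf_kernel` and `nystrom` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Squared Euclidean distances between all rows of X and all rows of Y.
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom(X, landmark_idx, gamma=1.0):
    """Return C (n x m) and W_pinv (m x m) such that K is approximated by C @ W_pinv @ C.T."""
    landmarks = X[landmark_idx]
    C = rbf_kernel(X, landmarks, gamma)   # kernel between all points and the landmarks
    W = C[landmark_idx]                   # kernel among the landmarks only
    return C, np.linalg.pinv(W)

# Hypothetical data: 200 samples, 40 randomly chosen landmarks.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
idx = rng.choice(200, size=40, replace=False)

C, W_pinv = nystrom(X, idx, gamma=0.1)
K_approx = C @ W_pinv @ C.T
K_full = rbf_kernel(X, X, gamma=0.1)      # computed here only to measure the error
rel_err = np.linalg.norm(K_full - K_approx) / np.linalg.norm(K_full)
print(f"relative Frobenius error: {rel_err:.3e}")
```

In practice only `C` and `W_pinv` are stored, which takes O(nm) memory instead of O(n²) for the full kernel matrix.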

    Entangled Kernels - Beyond Separability

    Publisher Copyright: © 2021 Microtome Publishing. All rights reserved. We consider the problem of operator-valued kernel learning and investigate the possibility of going beyond the well-known separable kernels. Borrowing tools and concepts from the field of quantum computing, such as partial trace and entanglement, we propose a new view on operator-valued kernels and define a general family of kernels that encompasses previously known operator-valued kernels, including separable and transformable kernels. Within this framework, we introduce another novel class of operator-valued kernels called entangled kernels that are not separable. We propose an efficient two-step algorithm for this framework, where the entangled kernel is learned based on a novel extension of kernel alignment to operator-valued kernels. We illustrate our algorithm with an application to supervised dimensionality reduction, and demonstrate its effectiveness with both artificial and real data for multi-output regression. Peer reviewed.

    Automatic identification of land cover types from satellite data with machine learning techniques

    This study is part of the TEKES-funded Electric Brain project of VTT and the University of Helsinki, whose goal is to develop novel techniques for automatic big data analysis. In this study we focus on potential methods for automated land cover type classification from time series satellite data. Techniques for identifying different environments would be beneficial for monitoring the effects of natural phenomena, forest fires, urbanization, or climate change. We tackle the classification problem with two approaches: supervised and unsupervised machine learning methods. From the former category we use the support vector machine (SVM), while from the latter we consider Gaussian mixture model clustering and its simpler variant, k-means. Chapter 1 introduces the techniques used in the study and gives the motivation for the work. Chapter 2 presents a detailed discussion of the data available for this study and the methods used for analysis, along with the simulated data created as a proof of concept for the methods. The results for both the simulated data and the satellite data are presented in chapter 3 and discussed in chapter 4, along with considerations for possible future work. The results suggest that support vector machines could be suitable for automated land cover type identification. While the clustering methods were not as successful, the supervised implementation reached an accuracy of up to 93 % on the data available for this study.

    Cross-view kernel transfer

    We consider the kernel completion problem in the presence of multiple views in the data. In this context the data samples can be fully missing in some views, creating missing columns and rows in the kernel matrices that are calculated individually for each view. We propose to solve the problem of completing the kernel matrices with the Cross-View Kernel Transfer (CVKT) procedure, in which the features of the other views are transformed to represent the view under consideration. The transformations are learned with kernel alignment to the known part of the kernel matrix, allowing for finding generalizable structures in the kernel matrix under completion. Its missing values can then be predicted with the data available in other views. We illustrate the benefits of our approach with simulated data, a multivariate digits dataset and a multi-view dataset on gesture classification, as well as with real biological datasets from studies of pattern formation in early Drosophila melanogaster embryogenesis.
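
As a rough illustration of the kernel alignment criterion referred to in the abstract above, the sketch below computes the widely used centered kernel alignment between two kernel matrices, restricted to the samples observed in both views. It is not the CVKT procedure itself; the linear kernels, the synthetic views, and the names `center` and `alignment` are assumptions made for this example.

```python
import numpy as np

def center(K):
    # Center the kernel matrix in feature space: H K H with H = I - (1/n) 11^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    # Centered alignment: normalized Frobenius inner product of the centered kernels.
    K1c, K2c = center(K1), center(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

# Hypothetical two-view data: view B is a linear transformation of view A.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(30, 4))
X_b = X_a @ rng.normal(size=(4, 6))
K_a = X_a @ X_a.T                      # linear kernel of view A
K_b = X_b @ X_b.T                      # linear kernel of view B

# Suppose only the first 20 samples are observed in view A:
# the alignment is evaluated on the known block of the kernel matrix.
observed = np.arange(20)
block = np.ix_(observed, observed)
score = alignment(K_a[block], K_b[block])
self_score = alignment(K_a, K_a)       # a kernel is perfectly aligned with itself
print(f"alignment on observed block: {score:.3f}")
```

A learning procedure would adjust the transformation applied to the other views so that this score on the known block increases, and then use the transformed features to predict the missing rows and columns.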

    Partial Trace Regression and Low-Rank Kraus Decomposition

    The trace regression model, a direct extension of the well-studied linear regression model, allows one to map matrices to real-valued outputs. We here introduce an even more general model, namely the partial-trace regression model, a family of linear mappings from matrix-valued inputs to matrix-valued outputs; this model subsumes the trace regression model and thus the linear regression model. Borrowing tools from quantum information theory, where partial trace operators have been extensively studied, we propose a framework for learning partial trace regression models from data by taking advantage of the so-called low-rank Kraus representation of completely positive maps. We show the relevance of our framework with synthetic and real-world experiments conducted for both i) matrix-to-matrix regression and ii) positive semidefinite matrix completion, two tasks which can be formulated as partial trace regression problems.
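
The two building blocks named in the abstract above, the partial trace and the Kraus form of a completely positive map, can be sketched with their textbook definitions. This is not the learning framework of the paper; the dimensions, the random matrices, and the names `partial_trace` and `kraus_map` are assumptions made for illustration.

```python
import numpy as np

def partial_trace(M, d1, d2):
    """Trace out the second subsystem of a (d1*d2) x (d1*d2) matrix, giving a d1 x d1 matrix."""
    M4 = M.reshape(d1, d2, d1, d2)
    return np.einsum("ijkj->ik", M4)

def kraus_map(A_stack, X):
    """Apply Phi(X) = sum_j A_j X A_j^T, where A_stack has shape (r, q, p) and X is p x p."""
    return np.einsum("jab,bc,jdc->ad", A_stack, X, A_stack)

rng = np.random.default_rng(0)

# Sanity check of the partial trace: tr_2(A kron B) = tr(B) * A.
A = rng.normal(size=(2, 2))
B = rng.normal(size=(3, 3))
reduced = partial_trace(np.kron(A, B), 2, 3)

# A map with r = 2 Kraus operators from 4 x 4 inputs to 2 x 2 outputs.
kraus_ops = rng.normal(size=(2, 2, 4))
G = rng.normal(size=(4, 4))
X = G @ G.T                            # a positive semidefinite input
Y = kraus_map(kraus_ops, X)            # the output is again positive semidefinite
print(np.linalg.eigvalsh(Y))
```

The number of Kraus operators `r` controls the size of the map's parameterization, which is the quantity a low-rank representation keeps small.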

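    Opetusmateriaali Flexicult® Vet-viljelymaljan käytöstä eläinklinikoiden hoitohenkilökunnalle [Teaching material on the use of the Flexicult® Vet culture plate for veterinary clinic nursing staff]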

    Urinary tract infections (UTIs) in cats and dogs are a common reason for seeking veterinary care. Culturing urine is one of the most commonly used methods for diagnosing a UTI. Infectious bacteria in the urine use the agar as a growth medium and grow into visible colonies on the plate. The colonies on the agar plate are identified, and the animal is treated with an antimicrobial drug (an antibiotic) that is effective against the bacterium in question. Bacteria, in humans as well as in animals, are able to develop resistance to drugs. This continuously developing resistance makes bacterial infections harder to treat: the more resistant the bacteria become, the harder it is to develop new effective drugs. Multidrug-resistant bacterial strains have recently become more common, and this is a current global phenomenon. Flexicult® Vet is a urine culture plate designed for cats and dogs. The plate is a chromogenic agar plate, which means that the bacterial colonies growing on it show a color characteristic of their species. Based on the color of the colony, the species of the bacteria can be determined. The agar plate is also divided into five compartments that contain different types of antibiotics. With these compartments it is possible to determine the susceptibility of the bacteria to those five drugs. The plate allows culturing and susceptibility testing simultaneously, saving time and money for the clinics. Our thesis was practice-oriented and resulted in an instructional video and two quick-reference instruction sheets. The topic came from Triolab, the company that imports the Flexicult® Vet urine culture plate. The video is intended for employees of animal clinics as an orientation to using the plate. The instruction sheets support employees who have already been oriented. The results of this thesis help Triolab staff carry out orientations in animal clinics. Orientation helps the clinic's employees to use the agar plate and interpret its results reliably. The instruction sheets serve as an easy-access memory aid for the clinic's employees at the different stages of the work.

    Kernel learning for structured data: a study on learning operator- and scalar-valued kernels for multi-view and multi-task learning problems

    Nowadays datasets with non-standard structures are more and more common. Examples include the already well-known multi-task framework, where each data sample is associated with multiple output labels, as well as the multi-view learning paradigm, in which each data sample can be seen to contain numerous descriptions. To obtain good performance in tasks like these, it is important to model well the interactions present in the views or output variables. Kernel methods offer a justified and elegant way to solve many machine learning problems. Operator-valued kernels, which generalize the well-known scalar-valued kernels, have recently gained attention as a way to learn vector-valued functions. The choice of a good kernel function plays a crucial role in the success of the learning task. This thesis offers kernel learning as a solution for various machine learning problems. Chapters two and three investigate learning the data interactions with multi-view data. In the first of these, the focus is on supervised inductive learning, and the interactions are modeled with operator-valued kernels. Chapter three tackles multi-view data and kernel learning in an unsupervised context and proposes a scalar-valued kernel learning method for completing missing data in the kernel matrices of a multi-view problem. In the last chapter we turn from multi-view to multi-output learning, and return to the supervised inductive learning paradigm. We propose a method for learning inseparable operator-valued kernels that model interactions between inputs and multiple output variables.

    Entangled Kernels - Beyond Separability

    We consider the problem of operator-valued kernel learning and investigate the possibility of going beyond the well-known separable kernels. Borrowing tools and concepts from the field of quantum computing, such as partial trace and entanglement, we propose a new view on operator-valued kernels and define a general family of kernels that encompasses previously known operator-valued kernels, including separable and transformable kernels. Within this framework, we introduce another novel class of operator-valued kernels called entangled kernels that are not separable. We propose an efficient two-step algorithm for this framework, where the entangled kernel is learned based on a novel extension of kernel alignment to operator-valued kernels. We illustrate our algorithm with an application to supervised dimensionality reduction, and demonstrate its effectiveness with both artificial and real data for multi-output regression.

    Noyaux à valeurs opérateurs et apprentissage de métriques multi-vues [Operator-valued kernels and multi-view metric learning]

    We consider the problem of metric learning in a multi-view setting and present a new method that learns metrics between views in operator-valued kernel spaces, making it possible to capture the multimodal structure of the data. We formulate this problem as a convex optimization problem and propose an iterative algorithm whose goal is to jointly learn the metrics and the classifier or regressor. To reduce the computational cost, a block-wise Nyström approximation of the multi-view kernel matrix is introduced. Experiments on artificial and real data were carried out to evaluate the proposed algorithm.

    Intrication et noyaux à valeurs opérateurs [Entanglement and operator-valued kernels]
